Skip to content

Retry streaming read errors in bundle download#98

Merged
kislyuk merged 4 commits intomasterfrom
akislyuk-retry-streaming-read-errors
Apr 2, 2018
Merged

Retry streaming read errors in bundle download#98
kislyuk merged 4 commits intomasterfrom
akislyuk-retry-streaming-read-errors

Conversation

@kislyuk
Copy link
Copy Markdown
Member

@kislyuk kislyuk commented Apr 2, 2018

Could use some tests and also a warning in the client's streaming API autodoc, but I'm not sure how I would set up a test case, and I'd like to postpone docs until I verify that this works in the download stress tests.

Comment thread hca/dss/__init__.py Outdated
if chunk:
fh.write(chunk)
while True:
dest_fh.write(fh.raw.read(1024*1024))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you want to terminate this loop at some point.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Still new at this programming thing.

Comment thread hca/dss/__init__.py
else:
logger.error("%s", "File {}: GET FAILED.".format(filename))
logger.error("%s", "Response: {}".format(response.text))
finally:
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think you still want the finally clause.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR switches from using requests.Response.iter_content (the questionable streaming read wrapper in requests) to the Swagger client context manager, which incorporates the cleanup logic:

def stream(self, **kwargs):
self._context_manager_response = self._request(kwargs, stream=True)
return self
def __enter__(self, **kwargs):
assert self._context_manager_response is not None
return self._context_manager_response
def __exit__(self, exc_type, exc_val, exc_tb):
self._context_manager_response.close()
self._context_manager_response = None

With this logic in place, the finally clause should no longer be necessary.

@codecov-io
Copy link
Copy Markdown

codecov-io commented Apr 2, 2018

Codecov Report

Merging #98 into master will decrease coverage by 0.07%.
The diff coverage is 76.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #98      +/-   ##
==========================================
- Coverage      88%   87.92%   -0.08%     
==========================================
  Files          29       29              
  Lines         967      969       +2     
==========================================
+ Hits          851      852       +1     
- Misses        116      117       +1
Impacted Files Coverage Δ
hca/dss/__init__.py 90.47% <76.92%> (-0.99%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b4ddfe2...7e641f9. Read the comment docs.

Copy link
Copy Markdown
Member

@ttung ttung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comments.

Comment thread hca/dss/__init__.py

file_path = os.path.join(dest_name, filename)

retries = self.get_session().adapters["https://"].max_retries
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure I agree with this, but I don't know what's better.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is retrieving the retry policy that the request logic ought to apply, and applying it manually. Arguably all this machinery might be better off in the client class, but we can move it there later. Do you have an issue with the policies being the same, or...?

Comment thread hca/dss/__init__.py Outdated
from io import open

import requests
from requests.packages.urllib3.exceptions import ProtocolError, DecodeError, ReadTimeoutError, MaxRetryError
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls remove MaxRetryError.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@kislyuk kislyuk merged commit 2cc8168 into master Apr 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants